Introduction

This report discusses findings from my data set. It covers the three questions I chose with the help of my professor.

To begin, I will give a short summary of my data set.
My data set contains all passers in the NFL from 2009 to 2018, meaning every single player who passed the ball. For example, if a wide receiver passed the ball on even one play, they were registered as a passer for that season. Although there are only 32 teams in the NFL and 10 seasons in question, we ended up with 958 data points. I scraped my data from a source on GitHub that had the years separated, so I had to merge and clean the data. In this report I will provide the code I used to create the data frame, but I will refer to it in the CSV form that I created (to show that I cleaned the data, and so I can turn in the data set I used to Brightspace).

If you are viewing the Markdown source, the data-cleaning code is below. If you are viewing the knitted report, ignore this line.
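As a rough illustration of the merge-and-clean step described above, here is a minimal sketch. It is not the original cleaning code: the two toy tables stand in for the per-season files, and the column names (Player, Age, Yds) and the single cleaning rule are assumptions made for the example.

```r
# Hypothetical per-season tables standing in for the yearly files;
# column names (Player, Age, Yds) are assumptions for illustration.
season_2009 <- data.frame(Player = c("A. Passer", "B. Thrower"),
                          Age = c(27, 31), Yds = c(3200, NA),
                          stringsAsFactors = FALSE)
season_2010 <- data.frame(Player = c("A. Passer", "C. Rookie"),
                          Age = c(28, 22), Yds = c(3500, 1200),
                          stringsAsFactors = FALSE)

seasons <- list(`2009` = season_2009, `2010` = season_2010)

# Tag each table with its season, then stack them into one data frame.
tagged <- Map(function(tbl, yr) cbind(Year = as.integer(yr), tbl),
              seasons, names(seasons))
passers <- do.call(rbind, tagged)
rownames(passers) <- NULL

# One simple cleaning rule (an assumption): drop rows with missing yards.
passers_clean <- passers[!is.na(passers$Yds), ]

# The cleaned result could then be saved for later use, for example:
# write.csv(passers_clean, "project_data.csv", row.names = FALSE)
```

Keeping the season as its own column is what makes the later year-by-year summaries possible after the files are stacked.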

The Dataframe

Here is my full data frame. I made it interactive using the datatable function from the DT library in case you want to scroll through it, sort it, or search it.

library(readr)  # provides read_csv()
library(DT)     # provides datatable()
df <- read_csv('project_data.csv')
datatable(df)

Now I will begin with the questions for the report.

Question 1

How does the age of passers vary in the dataset? Subquestion: How does it vary through the years? (Age [Continuous] vs Season [Categorical])

To answer the first part of the question I would normally make a histogram, but ages in the NFL are whole numbers that don’t vary too much, so I did not need bins. Instead, I used a bar plot with the stat = "count" argument. Here’s a look at the plot (made interactive, in case you want to take a closer look):
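A minimal sketch of the idea behind a count-based bar plot, using base R and made-up ages rather than the real Age column (the column name and the values are assumptions):

```r
# Toy ages standing in for the Age column (an assumed column name).
ages <- c(23, 24, 24, 25, 25, 25, 26, 26, 27, 28, 28, 29, 33, 37)

# One count per distinct age: the same kind of tally a count-based
# bar plot draws, so no histogram bins are needed.
age_counts <- table(ages)
print(age_counts)

# Base R bar plot with one bar per age.
barplot(age_counts, xlab = "Age", ylab = "Number of passers")
```

Because each age gets its own bar, gaps in the data (ages with no passers) simply do not appear, which is one difference from a binned histogram.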

As we can see, the majority of passers are in their mid 20’s. Like most sports, football is mostly played by young adults, and in an especially physical sport the wear and tear is obvious: the graph shows a drop-off in passers after age 28, followed by a steady decline. Only the really great players stick around the league. For reference, here are the quarterbacks past age 35 in the dataset with more than 10 starts (games featured in). Let’s take a look at the top 7:

As we can see, these quarterbacks are all household names. Only the great players last deep into their 30’s in the NFL, which helps explain the large drop-off from 28 to 29 years old.
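The veteran filter described above could look something like the sketch below. The data are invented, the column names (Player, Age, GS, Yds) are assumptions, "past age 35" is read as strictly older than 35, and ranking by passing yards is just one possible way to pick a "top 7".

```r
# Toy data; Player, Age, GS (games started), and Yds are assumed names.
qbs <- data.frame(
  Player = c("Vet A", "Vet B", "Backup C", "Young D", "Vet E"),
  Age    = c(38, 36, 37, 25, 41),
  GS     = c(16, 12, 2, 16, 15),
  Yds    = c(4100, 3300, 250, 3900, 4500),
  stringsAsFactors = FALSE
)

# Keep passers older than 35 with more than 10 starts.
veterans <- qbs[qbs$Age > 35 & qbs$GS > 10, ]

# Rank by passing yards (an assumed ranking) and keep at most 7 rows.
veterans <- veterans[order(-veterans$Yds), ]
top_veterans <- head(veterans, 7)
print(top_veterans)
```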

Despite this, the average age at quarterback in the NFL stays about the same from year to year. This is because of the NFL Draft: every year college prospects enter the NFL as rookies, and older players retire. Here is a look at how the minimum, mean, and maximum age of quarterbacks vary over the years in my dataset.

As we can see, there aren’t many changes: the average age of quarterbacks stays in the 25-28 range from year to year. The only statistic that changes a bit is the maximum age, because of how hard it is to stay healthy in a contact sport like football.
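For readers curious how a per-year minimum, mean, and maximum can be computed, here is a small base R sketch. The ages are invented and the Year and Age column names are assumptions.

```r
# Toy data; Year and Age are assumed column names.
ages_by_year <- data.frame(
  Year = c(2009, 2009, 2009, 2010, 2010, 2010),
  Age  = c(22, 27, 38, 23, 26, 35)
)

# tapply() applies a function to Age within each Year, in sorted Year order.
age_summary <- data.frame(
  Year     = sort(unique(ages_by_year$Year)),
  min_age  = as.vector(tapply(ages_by_year$Age, ages_by_year$Year, min)),
  mean_age = as.vector(tapply(ages_by_year$Age, ages_by_year$Year, mean)),
  max_age  = as.vector(tapply(ages_by_year$Age, ages_by_year$Year, max))
)
print(age_summary)
```

Plotting the three columns against Year then gives one line each for the youngest, average, and oldest passer per season.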

Conclusion for Question 1:

We can conclude that the answer to our question (How does the age of passers vary in the dataset, and how does it vary through the years?) is that the age of passers tends to average out in the 25-28 range, and the quarterbacks who do stick around into their later years are usually the star quarterbacks who are household names.

Question 2

Is there a relationship between a quarterback’s sacks and their total yards (passing yards + rushing yards)? (Sacks [Continuous] vs Total Yards [Continuous])

This question is clear and concise: how do sacks affect the total yards of a quarterback? The easiest way to show this is to put it on a graph, so here is an interactive scatterplot that does just that.

In this scatterplot we can see that there is, oddly, a positive relationship between sacks and total yards (the correlation was 0.8). This is likely because both totals grow with the number of games a quarterback plays. I therefore altered my code to compare average sacks per game with total yards, which gave me this:

This graph makes more sense (correlation = 0.19): once sacks are averaged per game, the relationship is much weaker. As we can see, not being sacked more than about 3 times a game appears to help quarterbacks make better decisions and have better seasons. However, when looking at my data I found a caveat: if we look at only the top 50 quarterbacks in the dataset, there is no clear relationship between sacks and the yardage performance of the season.
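The comparisons discussed above (total sacks, sacks per game, and a top-performer subset) can be sketched as follows. The numbers are toy values, not figures from the data set, and the column names (G, Sacks, TotalYds) are assumptions; the point is only to show why raw totals can correlate simply because both grow with games played.

```r
# Toy data; G (games), Sacks, and TotalYds are assumed column names.
qb <- data.frame(
  G        = c(2, 4, 8, 12, 16, 16),
  Sacks    = c(5, 9, 20, 30, 50, 18),
  TotalYds = c(300, 700, 1800, 3000, 3600, 5400)
)

# Normalize sacks by games played.
qb$SacksPerGame <- qb$Sacks / qb$G

# Pearson correlation of total yards with raw sacks and with sacks per game.
cor_total    <- cor(qb$Sacks, qb$TotalYds)
cor_per_game <- cor(qb$SacksPerGame, qb$TotalYds)

# Same comparison restricted to the top performers by total yards
# (top 3 here; the report uses the top 50).
top <- head(qb[order(-qb$TotalYds), ], 3)
cor_top <- cor(top$SacksPerGame, top$TotalYds)

print(c(total = cor_total, per_game = cor_per_game, top = cor_top))
```

With only a handful of rows these values are unstable, so the sketch illustrates the method rather than reproducing the 0.8 and 0.19 reported above.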

This finding is striking, and it supports what we hear in the media all the time: “When you have the right quarterback, nothing matters.” Good quarterbacks make do with what they have, which is why quarterbacks who average more than 3 sacks a game, like Aaron Rodgers, can still be successful enough to appear in this graph. Good quarterbacks also make better calls at the line of scrimmage. For example, Peyton Manning is regarded as one of the best, if not the best, pre-snap quarterbacks of all time, so it makes sense that his successful campaign with the Broncos had him gaining 5597 total yards with just 1.125 sacks per game.

Conclusion for Question 2:

We can conclude that sacks do matter: it is extremely hard to succeed as a quarterback, especially if you average more than 4 sacks a game. On the other hand, great quarterbacks can succeed despite struggles with protection. The better the quarterback, the less it matters whether the offensive line is good enough to prevent sacks.

Question 3

Which quarterbacks led the most game-winning drives over the decade, and how does their success compare to their overall performance metrics? (Game-Winning Drives [Continuous] vs Player [Categorical] vs Other Continuous Metrics)

To begin answering this question, I was struck by curiosity: who are the most successful quarterbacks when it comes to game-winning drives? Here’s a datatable showing the top 5 from 2009 to 2018:
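One way a top-5 table like this could be built is to sum game-winning drives per player across seasons and sort. The sketch below uses invented players and counts, and the column names (Player, GWD) are assumptions.

```r
# Toy season-level rows; Player and GWD are assumed column names.
seasons <- data.frame(
  Player = c("QB A", "QB A", "QB B", "QB B", "QB C", "QB D", "QB E", "QB F"),
  GWD    = c(4, 3, 2, 4, 1, 5, 2, 0),
  stringsAsFactors = FALSE
)

# Sum game-winning drives per player across all seasons.
gwd_totals <- aggregate(GWD ~ Player, data = seasons, FUN = sum)

# Sort from most to fewest and keep the top 5.
gwd_totals <- gwd_totals[order(-gwd_totals$GWD), ]
top5 <- head(gwd_totals, 5)
print(top5)
```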

Now let’s take a look at how these 5 quarterbacks vary from each other in the other statistical categories:

We can see that Drew Brees, who leads in game-winning drives, also leads in quarterback rating and average touchdowns per year, but Matthew Stafford, who ranked second, doesn’t come in second in those categories. So perhaps being clutch does not correlate directly with statistical success. How about we relate it to games won? Let’s take these 5 quarterbacks and see if the games they won correlate with their success in game-winning drives.

This indicates that there is a relationship between game-winning drives and wins. Many call the NFL the “Any Given Sunday League,” which means any given Sunday could be a win or a loss because of how close the competition is. This graph shows that there are indeed a lot of close games, because a game-winning drive means you are going on to win by one score.
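A relationship like this can also be checked numerically. The sketch below uses invented totals for five quarterbacks (the values are not from the data set) and shows both the correlation and a significance test; with only five points, any such result should be read cautiously.

```r
# Toy totals for five quarterbacks; values are invented for illustration.
gwd  <- c(35, 30, 28, 27, 25)   # game-winning drives over the decade
wins <- c(100, 70, 95, 80, 75)  # total wins over the decade

# Pearson correlation between game-winning drives and wins.
gwd_wins_cor <- cor(gwd, wins)

# cor.test() adds a p-value and confidence interval for the same estimate.
test_result <- cor.test(gwd, wins)

print(gwd_wins_cor)
print(test_result$p.value)
```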

Conclusion for Question 3:

We can conclude that in some cases game-winning drives translate directly to statistical success, but game-winning drives correlate more directly with total wins over the year. So the more clutch a quarterback is, the more likely they are to have a good season. As for statistical success, I believe it comes down to this: the better the quarterback is, the more likely they are to show up big when the game is on the line.

Thank You

Thank you for looking at my Passer Report. We reached several conclusions supported by evidence shown in graphs, and I hope you now have a better understanding of quarterbacks in the NFL. Thank you!